# Load packages
library(bayesrules)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## 
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
## 
##     extract
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
##   options(mc.cores = parallel::detectCores())
## 
## Attaching package: 'rstanarm'
## The following object is masked from 'package:rstan':
## 
##     loo
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
##    * Does _not_ affect other ggplot2 plots
##    * See ?bayesplot_theme_set for details on theme setting
library(tidybayes)
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(broom.mixed)

Exercise 9.1

a- A normal prior is reasonable here because the distribution should be roughly symmetric, with the mean and mode equal and the spread similar on each side.

b- A normal prior is not reasonable here because sigma must be positive, whereas a normal distribution places probability on negative values; an exponential prior respects this constraint.

c- A vague prior reflects essentially no knowledge of the parameter and has a very large variance (one example is a nearly flat line), while a weakly informative prior encodes only mild intuition about the plausible range of values.
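A quick way to see the difference is to compare draws from a vague prior and a weakly informative prior for the same coefficient (a minimal sketch, not from the exercise; the specific scales are illustrative):

```r
# Vague prior: enormous variance, says almost nothing about the coefficient
vague <- rnorm(10000, mean = 0, sd = 1000)
# Weakly informative prior: mild information about the plausible scale
weak  <- rnorm(10000, mean = 0, sd = 2.5)
# The vague prior treats wildly implausible values as plausible
quantile(vague, c(0.025, 0.975))
quantile(weak,  c(0.025, 0.975))
```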

Exercise 9.2

a- X= arm length; Y= height

b- X= distance between home+work; Y= carbon footprint (annual CO2 emissions)

c- X= age; Y= vocabulary level

d- X= sleep habits; Y= reaction time

Exercise 9.3

a- B0 would be the height of a kangaroo at birth (age 0 months) and B1 would be the expected increase in height for each additional month of age; B1 is likely positive, since kangaroos grow taller as they age.

b- B0 would be the number of GitHub followers someone has when they have 0 commits in the past week and B1 would be the expected change in followers per additional weekly commit; B1 is likely positive, since researchers who contribute more tend to attract more followers.

c- B0 would be the number of visitors to a local park on a day with 0 inches of rainfall and B1 would be the expected change in visitors per additional inch of rain; B1 is likely negative, since people are less inclined to visit a park when it is raining.

d- B0 would be the number of hours of Netflix a person watches if they never slept (which would not be good!) and B1 would be the expected change in viewing hours per additional hour of sleep; B1 is likely negative, since people tend to watch Netflix at night, and more sleep leaves fewer evening hours available for watching.

Exercise 9.4

As the strength of the relationship between Y and X increases, sigma should decrease: when X is a better predictor of Y, the observations cluster more tightly around the regression line.
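This can be illustrated by simulating the same linear trend with different values of sigma (a sketch with made-up numbers, not from the exercise):

```r
set.seed(84735)
x <- runif(200, 0, 10)
# Same trend, small vs. large sigma
y_strong <- 10 + 2 * x + rnorm(200, sd = 1)   # X predicts Y tightly
y_weak   <- 10 + 2 * x + rnorm(200, sd = 10)  # much more scatter around the line
c(cor(x, y_strong), cor(x, y_weak))           # correlation is much higher when sigma is small
```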

Exercise 9.5

a- X= age in years; Y= annual orange juice consumption in gallons

b- Y | B0, B1, sigma ~ N(u, sigma^2)

c- u = B0 + B1*X

d- The unknown parameters (across the models in b and c) are sigma, B0, and B1. B0 and B1 can be any real number, negative or positive (though some values are more likely than others), while sigma must be positive.

e- Since I am very unsure about B0 and B1, I would give their priors relatively large standard deviations; my rough estimates would be about 0.5 for B0 and about 0.25 for B1. I have essentially no intuition about sigma, so I would give it a high-spread prior as well.
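These choices could be translated into a stan_glm call along the following lines (a hypothetical sketch: the data frame oj and its columns gallons and age are assumed, and the prior scales are only rough guesses):

```r
oj_model <- stan_glm(gallons ~ age, data = oj,
                     family = gaussian,
                     prior_intercept = normal(0.5, 10),  # very unsure about B0
                     prior = normal(0.25, 5),            # very unsure about B1
                     prior_aux = exponential(1),         # vague guess for sigma
                     chains = 4, iter = 5000*2, seed = 84735)
```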

Exercise 9.6

a- X= today’s high temp; Y= tomorrow’s high temp

b- Y | B0, B1, sigma ~ N(u, sigma^2)

c- u = B0 + B1*X

d- The unknown parameters are sigma, B0, and B1. B0 and B1 can be any real number, negative or positive (though some values are more likely than others), while sigma must be positive.

e- Since the relationship between today's and tomorrow's high temperatures should be strong, I would expect sigma to be small, and would similarly give the priors for B0 and B1 small standard deviations. I would center B0 at about 0, since if today's high were 0 I would expect tomorrow's high to be near 0, and center B1 at about 1, since I expect nearly a 1:1 relationship.
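As a sketch, these tighter priors might look like the following stan_glm call (the data frame weather and its columns temp_tomorrow and temp_today are hypothetical, and the prior scales are only rough guesses):

```r
temp_model <- stan_glm(temp_tomorrow ~ temp_today, data = weather,
                       family = gaussian,
                       prior_intercept = normal(0, 5),   # B0 centered near 0
                       prior = normal(1, 0.1),           # B1 centered near a 1:1 slope
                       prior_aux = exponential(0.2),     # expecting a small sigma
                       chains = 4, iter = 5000*2, seed = 84735)
```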

Exercise 9.7

a- False

b- True

Exercise 9.8

a-

bunnies_model <- stan_glm(height ~ age, data = bunnies,
                          family = gaussian,
                          prior_intercept = normal(10, 5),
                          prior = normal(0, 2.5),
                          prior_aux = exponential(0.0008),
                          chains = 4, iter = 5000*2, seed = 84735)

b-

songs_model <- stan_glm(clicks ~ snaps, data = songs,
                        family = gaussian,
                        prior_intercept = normal(0, 2.5),
                        prior = normal(0, 2.5),
                        prior_aux = exponential(0.0008),
                        chains = 4, iter = 5000*2, seed = 84735)

c- dogs_model <- stan_glm(happiness ~ age, data = dogs, family = gaussian, prior_intercept = normal(0, 2.5), prior = normal(0, 2.5), prior_aux = exponential(0.0008), chains = 4, iter = 5000*2, seed = 84735)

Exercise 9.9

a- Y | B0, B1, sigma ~ N(u, sigma^2), where u = B0 + B1*X
B0 ~ N(5000, 2000^2)
B1 ~ N(-10, 5^2)
sigma ~ Exponential(0.0005)

b- Now we can simulate the normal regression prior model:

bikes_model <- stan_glm(rides ~ humidity, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(5000, 2000), 
                       prior = normal(-10, 5), 
                       prior_aux = exponential(0.0005),
                       chains = 5, iter = 8000, seed = 84735,
                       prior_PD = TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.7 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.702431 seconds (Warm-up)
## Chain 1:                0.146233 seconds (Sampling)
## Chain 1:                0.848664 seconds (Total)
## Chain 1: 
## 
## (Sampling output for chains 2-5 omitted; it mirrors chain 1.)

c-

# 100 simulated model lines
bikes |> 
  add_epred_draws(bikes_model, ndraws = 100) |> 
  ggplot(aes(x = humidity, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

Now we can plot the 4 datasets:

# Simulate four sets of data
set.seed(1234)
bikes |> 
  add_predicted_draws(bikes_model, ndraws = 4) |>
  ggplot(aes(x = humidity, y = rides)) +
    geom_point(aes(y = .prediction, group = .draw), size = 0.2) + 
    facet_wrap(~ .draw)

d- The prior understanding is that as humidity increases, ridership decreases; however, this is a very weak relationship with a high standard deviation (particularly visible in simulated datasets 1 and 3).

Exercise 9.10

a- Below is the plot of ridership v humidity in the data:

ggplot(bikes, aes(x = humidity, y = rides)) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'

In the bikes data, there appears to be a very slight negative relationship between humidity and rides.

b- I would say no: the variability in the data is so high that many of the points would fall more than two standard deviations from the trend, which is more spread than the prior understanding anticipated.

Exercise 9.11

a- Here we will use the update shortcut to simulate the Normal regression posterior model:

bikes_model_posterior <- update(bikes_model, prior_PD = FALSE)
## (MCMC sampling output for 5 chains omitted.)

b- Below are some MCMC diagnostics:

neff_ratio(bikes_model_posterior)
## (Intercept)    humidity       sigma 
##     0.98895     0.98095     0.95275
rhat(bikes_model_posterior)
## (Intercept)    humidity       sigma 
##    1.000068    1.000136    1.000145
mcmc_trace(bikes_model_posterior, size = .1)

mcmc_dens_overlay(bikes_model_posterior)

The diagnostics look good: the neff ratios are just under 1 (well above the typical 0.1 warning threshold), the Rhat values are essentially 1, and the trace and density plots show well-mixed, overlapping chains.

c-

# 100 simulated model lines
bikes |> 
  add_epred_draws(bikes_model_posterior, ndraws = 100) |> 
  ggplot(aes(x = humidity, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

The posterior model shows stronger evidence for a negative relationship between the two variables, albeit still a weak one, with some of the lines being nearly flat. Compared to the prior model, the lines are far more cohesive, especially for the intercept: the prior model's lines were scattered all over the y-axis.

Exercise 9.12

a-

tidy(bikes_model_posterior, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = .95)
## # A tibble: 4 × 5
##   term        estimate std.error conf.low conf.high
##   <chr>          <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)  4020.      225.     3574.    4460.  
## 2 humidity       -8.44      3.37    -15.1     -1.76
## 3 sigma        1574.       49.5    1482.    1678.  
## 4 mean_PPD     3485.       99.7    3289.    3681.

b- The posterior median of the sigma parameter is roughly 1574, meaning that on a given day, observed ridership tends to deviate from the model's mean prediction by around 1,574 rides.

c- The 95% posterior credible interval for the humidity coefficient is (-15.1, -1.8): there is a 95% posterior probability that B1 lies in this range, i.e., that each one-point increase in humidity is associated with an average decrease of somewhere between roughly 1.8 and 15.1 rides.

d- Yes we do; since the entire 95% credible interval for B1 (the humidity coefficient) lies below 0, it is highly likely that there is a negative association between ridership and humidity.
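This can also be checked directly from the MCMC draws by computing the proportion of posterior draws in which the humidity coefficient is negative (a quick sketch using the chains simulated above):

```r
bikes_model_df <- as.data.frame(bikes_model_posterior)
# Approximate posterior probability that B1 < 0
mean(bikes_model_df$humidity < 0)
```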

Exercise 9.13

a- First we will simulate the posterior model for the typical number of riders on 90% humidity days:

bikes_model_df <- as.data.frame(bikes_model_posterior)
first_set <- head(bikes_model_df, 1)
first_set
##   (Intercept)  humidity    sigma
## 1    4346.145 -14.14211 1576.984
mu <- first_set$`(Intercept)` + first_set$humidity * 90
mu
## [1] 3073.355

Under this posterior draw, the typical number of riders on a 90% humidity day is about 3073, but we still have to account for day-to-day variability:

set.seed(12345)
y_new <- rnorm(1, mean = mu, sd = first_set$sigma)
y_new
## [1] 3996.724

Now we can create the model for both the typical ridership and for ridership tomorrow:

set.seed(12345)
predict_90 <- bikes_model_df |> 
  mutate(mu = `(Intercept)` + humidity * 90,
         y_new = rnorm(20000, mean = mu, sd = sigma))
head(predict_90, 3)
##   (Intercept)   humidity    sigma       mu    y_new
## 1    4346.145 -14.142111 1576.984 3073.355 3996.724
## 2    3846.842  -7.067930 1525.479 3210.728 4293.004
## 3    3951.057  -8.151076 1659.455 3217.460 3036.076

The mu column here approximates the posterior model for typical ridership on 90% humidity days, and the y_new column approximates the posterior model for ridership tomorrow (an individual 90% humidity day).

b-

# Plot the posterior model of the typical ridership on 90% humidity days
ggplot(predict_90, aes(x = mu)) + 
  geom_density()

# Plot the posterior predictive model of tomorrow's ridership
ggplot(predict_90, aes(x = y_new)) + 
  geom_density()

While the two curves have similar shapes and centers, the predictive model for tomorrow's ridership has a much larger spread (and correspondingly a lower, flatter peak), since it accounts for individual-day variability on top of the uncertainty about typical ridership.

c- 80% CI:

# Construct 80% posterior credible intervals
predict_90 %>% 
  summarize(lower_mu = quantile(mu, 0.1),
            upper_mu = quantile(mu, 0.9),
            lower_new = quantile(y_new, 0.1),
            upper_new = quantile(y_new, 0.9))
##   lower_mu upper_mu lower_new upper_new
## 1  3113.44 3405.863  1236.876  5289.164

The 80% posterior prediction interval for the number of riders tomorrow is between 1237 and 5289.

d- We can closely follow the steps from the book here to use posterior_predict to confirm our earlier results:

# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- 
  posterior_predict(bikes_model_posterior, newdata = data.frame(humidity = 90))
posterior_interval(shortcut_prediction, prob = .8)
##        10%      90%
## 1 1260.288 5282.819
mcmc_dens(shortcut_prediction) +
  xlab("predicted ridership on a 90% humidity day")

The plot looks very similar, and the interval endpoints are close but not identical to the by-hand results: posterior_predict simulates a fresh set of predictions from a different random sequence, so the two approaches differ slightly due to simulation (Monte Carlo) error.
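A tiny standalone sketch (with made-up numbers, not from the exercise) shows how two simulations of the same quantity differ slightly when the random seeds differ:

```r
set.seed(1)
q1 <- quantile(rnorm(20000, mean = 3000, sd = 1600), c(0.1, 0.9))
set.seed(2)
q2 <- quantile(rnorm(20000, mean = 3000, sd = 1600), c(0.1, 0.9))
q1; q2  # close, but not identical
```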

Exercise 9.14

a- I would guess that there is a negative relationship between ridership and wind speed: when winds are higher, fewer riders would be out.

b- Below is the specification of the model in parts:

Y | B0, B1, sigma ~ N(u, sigma^2), where u = B0 + B1*X

B0 ~ N(5000, 1000^2): Since I am unsure of the intercept, I am giving it a large standard deviation of 1000; 5000 seems a reasonable estimate for average ridership when wind speed is 0.

B1 ~ N(-200, 20^2): I'll estimate that each 1 mph increase in wind speed is associated with about 200 fewer riders (with an SD of about 20).

sigma ~ Exponential(0.01): I'm expecting a fairly strong relationship, so I'll give sigma a prior mean of 1/0.01 = 100 rides.

c- Below is the simulated prior:

bikes_model2 <- stan_glm(rides ~ windspeed, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(5000, 1000),
                       prior = normal(-200, 20), 
                       prior_aux = exponential(0.01),
                       chains = 4, iter = 8000, seed = 12345, prior_PD=TRUE)
## (MCMC sampling output for 4 chains omitted.)

Below are the simulated datasets:

bikes |> 
  add_epred_draws(bikes_model2, ndraws = 100) |> 
  ggplot(aes(x = windspeed, y = rides)) +
  geom_line(aes(y = .epred, group = .draw), alpha = .15) +
  geom_point(data = bikes, size = .05)

Here the prior model lines show a strong negative relationship between the two variables, but with high variability (particularly in the intercept).

d- Below is the plot of rides v windspeed:

ggplot(bikes, aes(x = windspeed, y = rides)) +
  geom_point(size = .5) +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'

Here it looks like the relationship, while still negative, is moderate as opposed to very strong like in the prior.

Exercise 9.15

First we can simulate the posterior via the update shortcut:

bikes_model2_posterior <- update(bikes_model2, prior_PD = FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.260376 seconds (Warm-up)
## Chain 1:                0.331577 seconds (Sampling)
## Chain 1:                0.591953 seconds (Total)
## Chain 1: 
## (Sampling output for Chains 2-4 omitted; same progress pattern as Chain 1.)

Density plot:

mcmc_dens_overlay(bikes_model2_posterior)

Now we can plot this against the data:

bikes |> 
  add_epred_draws(bikes_model2_posterior, ndraws = 100) |> 
  ggplot(aes(x = windspeed, y = rides)) +
  geom_line(aes(y = .epred, group = .draw), alpha = .15) +
  geom_point(data = bikes, size = .05)

The posterior continues to show a strong negative relationship, falling between the moderate one in the real data and the very strong one in the prior.

Exercise 9.16

a- Below is the simulation of the normal regression prior. Since the problem said to use weakly informative priors, we only need to specify the intercept prior here; rstanarm supplies weakly informative defaults for the remaining parameters automatically:

penguins_model <- stan_glm(flipper_length_mm ~ bill_length_mm, 
                           data = penguins_bayes,
                           family = gaussian,
                           prior_intercept = normal(200, 25),
                           chains = 4, iter = 10000, seed = 12345, prior_PD = TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.133844 seconds (Warm-up)
## Chain 1:                0.165019 seconds (Sampling)
## Chain 1:                0.298863 seconds (Total)
## Chain 1: 
## (Sampling output for Chains 2-4 omitted; same progress pattern as Chain 1.)

b-

prior_summary(penguins_model)
## Priors for model 'penguins_model' 
## ------
## Intercept (after predictors centered)
##  ~ normal(location = 200, scale = 25)
## 
## Coefficients
##   Specified prior:
##     ~ normal(location = 0, scale = 2.5)
##   Adjusted prior:
##     ~ normal(location = 0, scale = 6.4)
## 
## Auxiliary (sigma)
##   Specified prior:
##     ~ exponential(rate = 1)
##   Adjusted prior:
##     ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details

Given these values, the prior model is defined as:

Y | β0, β1, σ ~ N(μ, σ²), where μ = β0 + β1 X
β0 ~ N(200, 25²)
β1 ~ N(0, 6.4²)
σ ~ Exp(0.071)
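
The adjusted scales in the prior_summary() output come from rstanarm's autoscaling: for a gaussian model, the default slope scale of 2.5 is multiplied by sd(y)/sd(x), and the exponential rate for sigma becomes 1/sd(y). A quick sketch of that arithmetic, using approximate sample standard deviations for the penguins data (the values 14.1 and 5.5 are assumptions for illustration, not computed from the data here):

```r
# rstanarm autoscaling sketch (assumed values, roughly the penguins sample sds)
sd_y <- 14.1   # approx sd of flipper_length_mm
sd_x <- 5.5    # approx sd of bill_length_mm

# Adjusted slope prior scale: default 2.5 times sd(y)/sd(x)
2.5 * sd_y / sd_x   # roughly 6.4, matching the prior_summary output

# Adjusted exponential rate for sigma: 1/sd(y)
1 / sd_y            # roughly 0.071
```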

c- Below are the 100 prior plausible model lines. (It looks as though I forgot to remove the NAs earlier, but the model drops them automatically when fitting; I do need to remove them here before plotting, however):

penguins_bayes |> 
  filter(!is.na(bill_length_mm), !is.na(flipper_length_mm)) |> 
  add_epred_draws(penguins_model, ndraws = 100) |> 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
  geom_line(aes(y = .epred, group = .draw), alpha = .15)

Oh boy, this is messy! It looks like there may be no relationship here in the prior. Let's try the 4 prior simulated datasets:

set.seed(12345)
penguins_bayes |> 
  filter(!is.na(bill_length_mm), !is.na(flipper_length_mm)) |> 
  add_predicted_draws(penguins_model, ndraws = 4) |> 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
    geom_point(aes(y = .prediction, group = .draw)) + 
    facet_wrap(~ .draw)

d- Overall, our prior understanding is very weak: the prior admits a great deal of uncertainty about the relationship, with both positive and negative slopes plausible.

Exercise 9.17

a- Below is the plot of the relationship in the data:

penguins_bayes |> 
  ggplot(aes(bill_length_mm, flipper_length_mm)) +
  geom_point() +
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).

Here in the data there appears to be a strong positive relationship between bill and flipper length!

b- I believe a normal regression model is appropriate here: the relationship looks linear, and the variability around the trend appears roughly constant across bill lengths.

Exercise 9.18

a- Here we can update our model to simulate the posterior:

set.seed(12345)
penguins_model_posterior <- update(penguins_model, prior_PD = FALSE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.193182 seconds (Warm-up)
## Chain 1:                0.388739 seconds (Sampling)
## Chain 1:                0.581921 seconds (Total)
## Chain 1: 
## (Sampling output for Chains 2-4 omitted; same progress pattern as Chain 1.)

b- Below are 100 posterior model lines (removing NAs here):

penguins_bayes |>  
  filter(!is.na(bill_length_mm), !is.na(flipper_length_mm)) |> 
  add_epred_draws(penguins_model_posterior, ndraws = 100) |> 
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15)

The posterior seems quite stable here and shows a strong positive relationship!

c- Here is the tidy summary with 90% CI:

tidy(penguins_model_posterior, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.90)
## # A tibble: 4 × 5
##   term           estimate std.error conf.low conf.high
##   <chr>             <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)      127.       4.57    119.      134.  
## 2 bill_length_mm     1.69     0.103     1.52      1.86
## 3 sigma             10.6      0.411    10.0      11.4 
## 4 mean_PPD         201.       0.816   200.      202.

d- The 90% credible interval for the bill length coefficient runs from 1.52 to 1.86. This indicates that, on average, flipper length increases by between 1.52 and 1.86 mm for every 1 mm increase in bill length.

e- Yes we do; since the interval contains neither 0 nor any negative values, we can safely say there is evidence that penguins with longer bills tend to have longer flippers.

Exercise 9.19

a- First we have to turn the model into a data frame:

penguins_model_df <- as.data.frame(penguins_model_posterior)
first_set <- head(penguins_model_df, 1)
first_set
##   (Intercept) bill_length_mm    sigma
## 1    129.2959       1.641349 11.10854

Now that we have these values, we can calculate mu for a single prediction, then repeat the simulation across all 20,000 posterior draws:

mu <- first_set$`(Intercept)` + first_set$bill_length_mm * 51
set.seed(12345)
y_new <- rnorm(1, mean = mu, sd = first_set$sigma)
predict_51 <- penguins_model_df |> 
  mutate(mu = `(Intercept)` + bill_length_mm * 51,
         y_new = rnorm(20000, mean = mu, sd = sigma))  # one row per posterior draw
head(predict_51, 4)
##   (Intercept) bill_length_mm    sigma       mu    y_new
## 1    129.2959       1.641349 11.10854 213.0047 220.8858
## 2    131.7835       1.587732 10.79382 212.7578 211.5780
## 3    122.7608       1.781570 10.73543 213.6209 208.7524
## 4    124.2860       1.727502 10.80971 212.3886 218.9380

The mu column models the typical (average) flipper length among penguins with a 51 mm bill, while the y_new column gives a posterior prediction for Pablo, an individual penguin with a 51 mm bill.

b- Below are the two density plots:

# mu plot
ggplot(predict_51, aes(x = mu)) + 
  geom_density()

# y_new plot
ggplot(predict_51, aes(x = y_new)) +
  geom_density()

As when we previously compared individual vs. average posterior plots, the spread for the individual (Pablo) is much larger than that of the average.

c- 80% posterior prediction interval for Pablo flipper length:

predict_51 |> summarize(lower_new = quantile(y_new, .1),
                        upper_new = quantile(y_new,.9))
##   lower_new upper_new
## 1   199.254  226.6436

The interval here runs from about 199.2 to 226.6.

d- The 80% credible interval for typical flipper length among penguins with 51 mm bills would be much narrower. An individual prediction must layer the full sampling variability (sigma) on top of the posterior uncertainty about the mean, whereas averaging over a group washes much of that variability away.
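
As a rough, self-contained sketch of why this happens, we can draw hypothetical posterior samples using approximately the summary values from the tidy output above (treating the parameters as independent normals is an assumption for illustration, not the actual joint posterior) and compare the spread of the mean against the spread of an individual prediction:

```r
set.seed(1)
# Hypothetical posterior draws, approximated from the tidy summary (assumed values)
b0    <- rnorm(20000, mean = 127,  sd = 4.57)   # intercept
b1    <- rnorm(20000, mean = 1.69, sd = 0.103)  # bill_length_mm slope
sigma <- rnorm(20000, mean = 10.6, sd = 0.411)  # residual sd

mu    <- b0 + b1 * 51             # typical flipper length at a 51 mm bill
y_new <- rnorm(20000, mu, sigma)  # one individual penguin: adds sigma on top

quantile(mu,    c(.1, .9))  # narrower interval for the average
quantile(y_new, c(.1, .9))  # wider interval for the individual
```

The y_new interval is wider because each draw adds fresh N(0, sigma) noise to mu.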

e- Using posterior_predict to confirm earlier results:

# First we will recreate the density plots
set.seed(12345)
shortcut_prediction <- posterior_predict(penguins_model_posterior, 
                                         newdata = data.frame(bill_length_mm = 51))
mcmc_dens(shortcut_prediction)

The density plot here is very similar to the one we made in part b (for Pablo the penguin). Now we will recreate the CI:

posterior_interval(shortcut_prediction, prob = .8)
##        10%      90%
## 1 199.2325 226.5126

These values are very similar to the CI we created earlier!

Exercise 9.20

a- Their prior understanding is that, for every 1 gram increase in body mass, average flipper length increases by 0.01 mm, with plausible values between 0.006 and 0.014 mm.
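
If we read their stated range as roughly a 95% prior interval, a Normal(0.01, 0.002) prior reproduces it, since mean ± 2 sd spans 0.006 to 0.014. A quick check:

```r
# 95% prior interval for the slope under Normal(mean = 0.01, sd = 0.002)
qnorm(c(.025, .975), mean = 0.01, sd = 0.002)
# roughly 0.006 to 0.014, matching the researchers' stated range
```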

b- Below is the plot of flipper length and body mass:

ggplot(penguins_bayes, aes(x = body_mass_g, y = flipper_length_mm)) + 
  geom_point(size = 0.5) + 
  geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).

Among the sampled penguins, there does appear to be a very strong positive relationship between body mass and flipper length.

c- I am pretty unsure, but I would guess that the sigma parameter is larger when X = body mass: even though the relationship appears stronger, the points scatter more widely around the line in this plot than in the earlier bill length vs. flipper length plot.

d- Model:

penguins_model_2 <- stan_glm(flipper_length_mm ~ body_mass_g, 
                             data = penguins_bayes, 
                             family = gaussian,
                             prior = normal(0.01, 0.002), 
                             chains = 4, iter = 10000, seed = 12345)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.216463 seconds (Warm-up)
## Chain 1:                0.355844 seconds (Sampling)
## Chain 1:                0.572307 seconds (Total)
## Chain 1: 
## (Sampling output for Chains 2-4 omitted; same progress pattern as Chain 1.)

e- Below is the posterior density plot for the β1 (body mass) coefficient:

mcmc_dens_overlay(penguins_model_2, pars = c("body_mass_g"))

Compared to the prior mean of 0.01, the posterior concentrates on higher values, so the relationship is stronger than the researchers originally expected.